An A Priori Algorithm R Example

Loading required package: arules
Loading required package: Matrix
Attaching package: ‘arules’
The following objects are masked from ‘package:base’:
%in%, write
#Example of Association Rules
#Here is one (crude) way to prepare a list of transactions for the 602X Problem
require(arules)  # load arules, which provides the transactions class and apriori()
a_list<-list(
c("CrestTP","CrestTB"),
c("OralBTB"),
c("BarbSC"),
c("ColgateTP","BarbSC"),
c("OldSpiceSC"),
c("CrestTP","CrestTB"),
c("AIMTP","GUMTB","OldSpiceSC"),
c("ColgateTP","GUMTB"),
c("AIMTP","OralBTB"),
c("CrestTP","BarbSC"),
c("ColgateTP","GilletteSC"),
c("CrestTP","OralBTB"),
c("AIMTP"),
c("AIMTP","GUMTB","BarbSC"),
c("ColgateTP","CrestTB","GilletteSC"),
c("CrestTP","CrestTB","OldSpiceSC"),
c("OralBTB"),
c("AIMTP","OralBTB","OldSpiceSC"),
c("ColgateTP","GilletteSC"),
c("OralBTB","OldSpiceSC"),
c(),
c(),
c(),
# Many similar rows have been deleted here
.
.
c(),
c(),
c(),
c(),
c()
)
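The empty transactions do not have to be typed one by one. As a rough sketch only (the helper `pad_transactions` and the objects `baskets` and `padded` are illustrative names, not part of the original session, and only the first three baskets are repeated here), the non-empty baskets can be padded with empty character vectors up to the 100 transactions used in this example. `character(0)` is used instead of `c()`, which evaluates to NULL.

```r
# Hypothetical helper: pad a list of baskets with empty transactions
# until it holds n_total elements.
pad_transactions <- function(baskets, n_total) {
  n_empty <- n_total - length(baskets)
  stopifnot(n_empty >= 0)
  c(baskets, rep(list(character(0)), n_empty))
}

# First three baskets from the listing above (illustration only)
baskets <- list(
  c("CrestTP", "CrestTB"),
  c("OralBTB"),
  c("BarbSC")
)

padded <- pad_transactions(baskets, 100)
names(padded) <- paste0("Tr", seq_along(padded))
length(padded)
```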
>
> #Set transaction names
>
> names(a_list) <- paste("Tr",c(1:100), sep = "")
> a_list
$Tr1
[1] "CrestTP" "CrestTB"
$Tr2
[1] "OralBTB"
$Tr3
[1] "BarbSC"
$Tr4
[1] "ColgateTP" "BarbSC"
$Tr5
[1] "OldSpiceSC"
$Tr6
[1] "CrestTP" "CrestTB"
$Tr7
[1] "AIMTP" "GUMTB" "OldSpiceSC"
$Tr8
[1] "ColgateTP" "GUMTB"
$Tr9
[1] "AIMTP" "OralBTB"
$Tr10
[1] "CrestTP" "BarbSC"
$Tr11
[1] "ColgateTP" "GilletteSC"
$Tr12
[1] "CrestTP" "OralBTB"
$Tr13
[1] "AIMTP"
$Tr14
[1] "AIMTP" "GUMTB" "BarbSC"
$Tr15
[1] "ColgateTP" "CrestTB" "GilletteSC"
$Tr16
[1] "CrestTP" "CrestTB" "OldSpiceSC"
$Tr17
[1] "OralBTB"
$Tr18
[1] "AIMTP" "OralBTB" "OldSpiceSC"
$Tr19
[1] "ColgateTP" "GilletteSC"
$Tr20
[1] "OralBTB" "OldSpiceSC"
$Tr21
NULL
$Tr22
NULL
$Tr23
NULL
.
#Many similar rows have been deleted here
.
.
$Tr98
NULL
$Tr99
NULL
$Tr100
NULL
>
> #Coerce into transactions
> trans <- as(a_list, "transactions")
There were 50 or more warnings (use warnings() to see the first 50)
>
> #Analyze transactions
>
> summary(trans)
transactions as itemMatrix in sparse format with
 100 rows (elements/itemsets/transactions) and
 9 columns (items) and a density of 0.04444444
most frequent items:
   OralBTB      AIMTP  ColgateTP    CrestTP OldSpiceSC    (Other)
         6          5          5          5          5         14
element (itemset/transaction) length distribution:
sizes
 0  1  2  3
80  5 10  5
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.0     0.0     0.0     0.4     0.0     3.0
includes extended item information - examples:
     labels
1     AIMTP
2    BarbSC
3 ColgateTP
includes extended transaction information - examples:
  transactionID
1           Tr1
2           Tr2
3           Tr3
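The mean length and the density in the summary can be checked by hand from the length distribution. The base R sketch below uses illustrative variable names and takes density to be the fraction of non-zero cells in the 100 x 9 transaction-by-item matrix.

```r
# Length distribution reported by summary(trans)
sizes  <- c(0, 1, 2, 3)    # items per transaction
counts <- c(80, 5, 10, 5)  # number of transactions of each size

n_trans <- sum(counts)          # 100 transactions
n_items <- 9                    # columns (distinct items)
n_cells <- sum(sizes * counts)  # item occurrences: 5 + 20 + 15 = 40

mean_len <- n_cells / n_trans              # mean transaction length
density  <- n_cells / (n_trans * n_items)  # fraction of filled cells

mean_len  # 0.4
density   # 0.04444444
```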
>
> inspect(trans)
    items                             transactionID
1   {CrestTB, CrestTP}                Tr1
2   {OralBTB}                         Tr2
3   {BarbSC}                          Tr3
4   {BarbSC, ColgateTP}               Tr4
5   {OldSpiceSC}                      Tr5
6   {CrestTB, CrestTP}                Tr6
7   {AIMTP, GUMTB, OldSpiceSC}        Tr7
8   {ColgateTP, GUMTB}                Tr8
9   {AIMTP, OralBTB}                  Tr9
10  {BarbSC, CrestTP}                 Tr10
11  {ColgateTP, GilletteSC}           Tr11
12  {CrestTP, OralBTB}                Tr12
13  {AIMTP}                           Tr13
14  {AIMTP, BarbSC, GUMTB}            Tr14
15  {ColgateTP, CrestTB, GilletteSC}  Tr15
16  {CrestTB, CrestTP, OldSpiceSC}    Tr16
17  {OralBTB}                         Tr17
18  {AIMTP, OldSpiceSC, OralBTB}      Tr18
19  {ColgateTP, GilletteSC}           Tr19
20  {OldSpiceSC, OralBTB}             Tr20
21  {}                                Tr21
22  {}                                Tr22
23  {}                                Tr23
.
#Many similar rows deleted here
.
.
98  {}                                Tr98
99  {}                                Tr99
100 {}                                Tr100
>
> rules<-apriori(trans,parameter=list(supp=.02, conf=.5, target="rules"))
parameter specification:
 confidence minval smax arem  aval originalSupport support minlen maxlen target   ext
        0.5    0.1    1 none FALSE            TRUE    0.02      1     10  rules FALSE
algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE
apriori - find association rules with the apriori algorithm
version 4.21 (2004.05.09)        (c) 1996-2004   Christian Borgelt
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[9 item(s), 100 transaction(s)] done [0.00s].
sorting and recoding items ... [9 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [5 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
>
> inspect(head(sort(rules,by="lift"),n=20))
  lhs             rhs           support confidence     lift
1 {GilletteSC} => {ColgateTP}      0.03  1.0000000 20.00000
2 {ColgateTP}  => {GilletteSC}     0.03  0.6000000 20.00000
3 {CrestTB}    => {CrestTP}        0.03  0.7500000 15.00000
4 {CrestTP}    => {CrestTB}        0.03  0.6000000 15.00000
5 {GUMTB}      => {AIMTP}          0.02  0.6666667 13.33333
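The three measures in the rule table can be reproduced by direct counting. The base R sketch below is not how arules computes them internally; `baskets`, `n_total`, `count_with` and `rule_metrics` are names introduced here for illustration. It uses the 20 non-empty transactions listed earlier and takes the total of 100 transactions from the summary output.

```r
# The 20 non-empty transactions from the listing; the other 80 are empty
baskets <- list(
  c("CrestTP", "CrestTB"), c("OralBTB"), c("BarbSC"),
  c("ColgateTP", "BarbSC"), c("OldSpiceSC"), c("CrestTP", "CrestTB"),
  c("AIMTP", "GUMTB", "OldSpiceSC"), c("ColgateTP", "GUMTB"),
  c("AIMTP", "OralBTB"), c("CrestTP", "BarbSC"),
  c("ColgateTP", "GilletteSC"), c("CrestTP", "OralBTB"), c("AIMTP"),
  c("AIMTP", "GUMTB", "BarbSC"), c("ColgateTP", "CrestTB", "GilletteSC"),
  c("CrestTP", "CrestTB", "OldSpiceSC"), c("OralBTB"),
  c("AIMTP", "OralBTB", "OldSpiceSC"), c("ColgateTP", "GilletteSC"),
  c("OralBTB", "OldSpiceSC")
)
n_total <- 100  # total transactions, including the empty ones

# Count the transactions that contain every item in `items`
count_with <- function(items) {
  sum(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Support, confidence and lift for the rule lhs => rhs
rule_metrics <- function(lhs, rhs) {
  supp_rule <- count_with(c(lhs, rhs)) / n_total
  supp_lhs  <- count_with(lhs) / n_total
  supp_rhs  <- count_with(rhs) / n_total
  conf <- supp_rule / supp_lhs
  c(support = supp_rule, confidence = conf, lift = conf / supp_rhs)
}

rule_metrics("GilletteSC", "ColgateTP")  # support 0.03, confidence 1, lift 20
rule_metrics("GUMTB", "AIMTP")
```

Lift is the confidence divided by the support of the right-hand side, so for the first rule it is 1 / 0.05 = 20: ColgateTP appears in 5 of the 100 transactions, and in every transaction that contains GilletteSC.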